Investigating dark energy experiments with principal components

We use a principal component approach to contrast different kinds of probes of dark energy, and
to emphasize how an array of probes can work together to constrain an arbitrary equation of state 
history w(z). We pay particular attention to the role of the priors in assessing the information 
content of experiments and propose using an explicit prior on the degree of smoothness of w(z) 
that is independent of the binning scheme. We also show how a figure of merit based on the mean 
squared error probes the number of new modes constrained by a data set, and use it to examine 
how informative various experiments will be in constraining the evolution of dark energy. 

PACS numbers: 



I. INTRODUCTION 

There is growing evidence indicating that the expan- 
sion rate of the Universe is accelerating, either due to 
modified gravitational physics or some new type of re- 
pulsive 'dark energy' coming to dominate the Universe. 
Evidence for this comes from several directions. First, 
measures of the present dark matter density show that 
it is unable to explain the observed Hubble expansion 
rate [1, 2]. Second, probes of the past expansion rate us- 
ing high redshift supernovae (SN) as standard candles 
directly show acceleration [3, 4], and this is confirmed by 
the angular size of features in the cosmic microwave back- 
ground (CMB) [5] and the baryon acoustic features in 
the galaxy power spectrum [6]. Finally, CMB-large scale 
structure correlations (see e.g. [7]) provide evidence that 
the growth rate of fluctuations, which depends directly 
on the background expansion history, is inconsistent with 
a matter dominated universe. 

Studies are underway to improve the observational pic- 
ture: SN surveys will be expanded and pushed to higher 
redshifts, microwave background and cluster studies will 
improve the constraints on the present matter density, 
and gravitational lensing and high redshift large scale 
structure surveys will probe directly the growth rate. 
These will seek to answer some fundamental questions: 
Are the data consistent with a cosmological constant, or 
is there evidence for some kind of dynamical dark energy, 
such as quintessence? Are the direct probes of the back- 
ground expansion history consistent with the growth of 
perturbations for some dark energy model, or is a modi- 
fication of gravity required? 

These questions would be straightforward to answer
given a specific dynamical dark energy model, or even a 
family of such models. However, there is no favored dy- 
namical model. Even in the quintessence models, where 
the dark energy is due to a scalar field rolling down a 
potential well, virtually any behavior of the equation of 
state w(z) is possible by choosing the appropriate poten- 
tial [8]. This makes it essentially impossible to find an 
experiment which will conclusively prove that the dark
energy is or is not dynamical. Even if we find that data
are consistent with a cosmological constant, it is very 
likely that there would still exist a family of dynamical 
DE models which would improve the fit to the data. The 
question would remain whether any of these models is 
expected on theoretical grounds. 

Another important question is whether the constraints 
from measures of w(z) coming from the background ex- 
pansion (CMB, SN) are consistent with those arising 
from the growth of perturbations (correlations, weak 
lensing), since inconsistencies could potentially indicate 
a breakdown of general relativity on large scales [9]. If 
observations appear inconsistent, it might be explained 
by a high frequency feature in w(z); again, we must rely 
on theory to determine how baroque this solution is, or 
whether modified gravity is more natural. 

Given this fundamental difficulty, we choose to focus 
instead on the observations and what they might potentially
tell us, using a principal component approach introduced
in the dark energy literature by Huterer and Starkman
(HS) [10] and subsequently used by [11]. Any study
of dynamical dark energy models unfortunately must be- 
gin with some kind of parameterization which implicitly 
imposes a measure on the space of models. For this dis- 
cussion we focus on parameterizing w(z) alone, though 
in principle we might also include the dark energy sound 
speed. 

Choosing a parameterization is necessarily arbitrary 
and answers depend on the choice [12, 13]. We choose 
our basis following a few simple principles. First, it should 
have enough freedom to be able to reproduce most of the 
phenomenological w(z) used in the literature, as well as
the w(z) typically derived by the potentials which have 
been considered (e.g. [14]). Second, we would like to keep 
as few parameters as possible, as long as they are capa- 
ble of capturing the effect of w(z) on the observations. 
To implement these, we first assume that w(z) is rea- 
sonably smooth and bin in finite redshift bins which im- 
plicitly excludes variations on time-scales smaller than a 
bin width. We also cut off the coverage at high redshift, 
since no experiment is likely to constrain w(z) at these
redshifts (unless w(z) becomes larger than zero at some
point).

We choose 40 uniform redshift bins, stretching to a 
maximum redshift of z = 3. We have checked that our 
results are relatively independent of the size of the bin 
and whether we use a linear or logarithmic binning. The 
results are also little affected if we allow the high redshift 
(z > 3) equation of state to vary or if we fix it at the fidu- 
cial value. To avoid w(z) with infinite derivatives, each 
bin rises and falls following a hyperbolic tangent function 
with typical dz of order 10% of the width of a bin. As the 
fiducial model, we choose a constant $w_0 = -1.0$ model;
this is not only consistent with the present data, but
also can be used as a fiducial model for most other
phenomenological parameterizations considered so far. Since
dark energy perturbations (DEP) play a crucial role in 
the parameter estimation [15, 16], we use a modified ver- 
sion of CAMB which allows us to calculate DEP for an 
arbitrary w(z) consistently [17]. 
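
As an illustration of the binning described above (not the code used for the forecasts), the following minimal sketch builds a smoothly binned w(z) in which neighbouring bins are joined by hyperbolic tangent steps. The function name `binned_w`, its arguments, and the exact form of the step are assumptions made for this example; only the 40 bins, the maximum redshift of 3 and the transition width of roughly 10% of a bin come from the text.

```python
import math

def binned_w(z, bin_values, z_max=3.0, smooth_frac=0.1, w_high=-1.0):
    """Smoothly binned w(z): constant inside each bin, tanh steps between bins.

    bin_values  : list of w_i, one per uniform bin on [0, z_max]
    smooth_frac : transition scale as a fraction of the bin width (assumed form)
    w_high      : value that w(z) relaxes to beyond z_max (assumed fixed here)
    """
    n = len(bin_values)
    width = z_max / n
    dz = smooth_frac * width
    w = bin_values[0]
    # Each interior bin edge contributes a smoothed step to the next bin value.
    for i in range(1, n):
        edge = i * width
        step = 0.5 * (1.0 + math.tanh((z - edge) / dz))
        w += (bin_values[i] - bin_values[i - 1]) * step
    # Last step: from the final bin to the fixed high-redshift value.
    step = 0.5 * (1.0 + math.tanh((z - z_max) / dz))
    w += (w_high - bin_values[-1]) * step
    return w

# Fiducial model plus a perturbation of a single bin covering 0.75 < z < 0.825.
fiducial = [-1.0] * 40
perturbed = list(fiducial)
perturbed[10] = -0.8
w_centre = binned_w(0.7875, perturbed)   # deep inside the perturbed bin
w_edge = binned_w(0.75, perturbed)       # on the lower edge of that bin
```

With this construction w(z) has finite derivatives everywhere, and a Fisher matrix derivative with respect to one bin value corresponds to a smooth top-hat perturbation in redshift.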

For the principal component analysis, we calculate
the Fisher matrix based on four kinds of experiments:
supernovae surveys, CMB anisotropies, galaxy counts
(GC) correlation functions and weak lensing (WL)
observations; we also include possible cross-correlations
between these, including CMB-galaxy and CMB-weak lensing
(which are sensitive to the integrated Sachs-Wolfe
effect) as well as galaxy-weak lensing correlations. We
consider the limits which might be obtained in a decade's
time, by experiments like JDEM [18], Planck [19] and
LSST [20].

We follow the HS conventions for the principal component
analysis, but our underlying approach is Bayesian
rather than frequentist. We first calculate the Fisher
matrices for each of the experiments, $F_{ij}$, where the indices
i,j run over the parameters of the theory, in our case the
binned $w_i$. We find the normalized eigenvectors and
eigenvalues of this matrix, $(e_i(z), \lambda_i)$, and so can write

$$F = W^T \Lambda W, \qquad (1)$$

where the rows of W are the eigenvectors and $\Lambda$ is a
diagonal matrix with elements $\lambda_i$. The Fisher matrix is
an estimate of the inverse covariance matrix we expect
the data to give us and the eigenvalues reflect how well
the amplitude of each eigenvector can be measured. The
true behavior of the equation of state can be expanded
in the eigenvectors as

$$w(z) = w_{\rm fid}(z) + \sum_{i=1}^{N} \alpha_i e_i(z) \qquad (2)$$

and the expected error in the recovered amplitudes is
given by $\sigma(\alpha_i) = \lambda_i^{-1/2}$ (assuming the true model is
reasonably close to the original fiducial model which was
used to calculate the Fisher matrix).
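
The decomposition in Eqs. (1) and (2) can be sketched numerically as follows. This is a toy example under stated assumptions: the Fisher matrix is a random positive definite matrix generated for illustration, not one of the forecast matrices, and NumPy's `numpy.linalg.eigh` is used as one convenient way to diagonalize a symmetric matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins = 5

# Toy Fisher matrix, symmetric positive definite by construction (illustrative only).
A = rng.normal(size=(n_bins, n_bins))
F = A @ A.T + 1e-3 * np.eye(n_bins)

# eigh returns ascending eigenvalues and the eigenvectors as columns.
lam, vecs = np.linalg.eigh(F)
order = np.argsort(lam)[::-1]            # best measured mode first
lam = lam[order]
W = vecs[:, order].T                     # rows of W are the eigenvectors e_i

# Forecast error on each eigenmode amplitude, sigma(alpha_i) = lambda_i^(-1/2).
sigma_alpha = 1.0 / np.sqrt(lam)

# Expand a deviation from the fiducial model in the eigenvector basis, Eq. (2).
delta_w = rng.normal(scale=0.1, size=n_bins)
alpha = W @ delta_w                      # amplitudes alpha_i
delta_w_back = W.T @ alpha               # sum_i alpha_i e_i recovers the deviation
```

Because the eigenvectors are orthonormal, the amplitudes are obtained by a simple projection and the expansion is exactly invertible.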



II. CHOOSING A PRIOR 

Lacking any prior knowledge of the possible w(z) functions,
all of the eigenvectors are informative, no matter
how large the error bars. However, this is unduly
pessimistic. Even without a physical model for w(z),
we would still be surprised if it was much too positive
($w \gg 1/3$) or much too negative ($w \ll -1$). Such
preconceptions constitute our theoretical priors, and in
principle allow us to roughly separate the eigenmodes into
those which are informative relative to the priors and
those that are not. If the error bars on an eigenmode
allow much too positive or too negative w(z), then we have
not really learned anything.

We need some way of quantifying our theoretical biases;
however, this is necessarily subjective and calls for
theoretical input. One possible choice is to use a weak Gaussian
prior on the amplitude of w(z) in any given bin, e.g.
$w_i = w_0 \pm \sigma_p$, and assume that the bins are uncorrelated.
This would prevent w(z) from deviating too much from 
the fiducial model in any given bin. The problem is that 
in this case formally all the modes have the same impor- 
tance which does not reflect our actual prior assumptions. 
The choice to bin the parameter is implicitly motivated 
by an assumption that the parameter is in some sense 
smooth, that high frequency modes are less likely or less 
interesting than low frequency modes. A binning in ef- 
fect provides a sharp cutoff, giving equal weight to all 
modes below the Nyquist frequency and no representa- 
tion of modes with higher frequencies. It is perhaps more 
intuitive that the prior should have a more gentle transi- 
tion, whereby the prior probability of a mode is gradually 
decreased as its frequency increases. 

One alternative way of implementing priors on w(z) is 
to choose a correlation function describing fluctuations 
away from some fiducial model. Lacking a specific prior 
from fundamental theory, we propose treating the equa- 
tion of state as a random field evolving with a given corre- 
lation period (in time or redshift.) Much like the choice 
to bin in the first place, this prior is based on the as- 
sumption that the equation of state is evolving smoothly. 
While in this paper we focus on using correlations in red- 
shift, a more physical independent variable could be used, 
such as scale factor or proper time. Since the prior is re- 
ally only an extension of the binning choice, ideally one 
should use the same independent variable for both. 

In effect, the correlation function prior provides a tran- 
sition between high frequency oscillations, which are re- 
sisted by the prior, and the low frequency modes, which 
are unaffected. Providing a prior stabilizes the high fre- 
quency variances and allows us to focus on the more in- 
teresting low frequency modes. Also, as long as there are 
sufficient bins compared to the correlation length, the 
prior largely wipes out dependence on the precise choice 
of binning. 

In practice, we need to construct the covariance ma- 
trix associated with the prior (which will be inverted and 
added to the Fisher matrix from the observations). The
starting point is to assume that the deviations of the
equation of state from its fiducial model (e.g. $w = -1$)
can be encapsulated in a correlation function:

$$\xi_w(|z - z'|) \equiv \left\langle [w(z) - w_{\rm fid}(z)]\,[w(z') - w_{\rm fid}(z')] \right\rangle . \qquad (3)$$

Such a form implicitly assumes independence of transla- 
tions in redshift; that is, that there is no preferred epoch 
for variations from the fiducial equation of state. (Though 
such a preferred epoch could be built into the fiducial 
model itself.) As stated above, we choose redshift as our 
independent variable, but the same considerations would 
apply to another choice, such as the scale factor. 

Let us assume the $i$th bin is from $z_i$ to $z_i + \Delta$, and
for simplicity we will assume that all bins have the same
width $\Delta = z_{i+1} - z_i$. The equation of state averaged over
each bin is given by

$$w_i = \frac{1}{\Delta} \int_{z_i}^{z_i + \Delta} dz\, w(z). \qquad (4)$$

We can write the variation from the fiducial model averaged
over the bin as $\delta w_i = w_i - w_{\rm fid}$. Calculating the
covariance matrix of the binned equation of state is then
straightforward:

$$\langle \delta w_i \delta w_j \rangle = \frac{1}{\Delta^2} \int_{z_i}^{z_i + \Delta} dz \int_{z_j}^{z_j + \Delta} dz'\, \xi_w(|z - z'|). \qquad (5)$$

All that remains is to calculate this covariance matrix for 
a given functional form of the correlation function. 

We assume that it has a characteristic correlation redshift
distance $z_c$ after which it falls off; for example, let us
assume $\xi_w(z) = \xi_w(0)/(1 + (z/z_c)^2)$. In this case we can
perform the integrals analytically, using the relations,

$$\int \frac{dx}{1 + x^2} = \tan^{-1} x, \qquad (6)$$

$$\int dx\, \tan^{-1} x = x \tan^{-1} x - \frac{1}{2} \log(1 + x^2). \qquad (7)$$

The covariance between two bins of width $\Delta$ separated
by $z = |z_i - z_j|$ can be shown to be

$$\langle \delta w_i \delta w_j \rangle = \xi_w(0) \frac{z_c^2}{\Delta^2} \Big[ x_+ \tan^{-1} x_+ + x_- \tan^{-1} x_- - 2x \tan^{-1} x + \log(1 + x^2) - \frac{1}{2} \log(1 + x_+^2) - \frac{1}{2} \log(1 + x_-^2) \Big] \qquad (8)$$

where $x = z/z_c$, $x_+ = (z + \Delta)/z_c$ and $x_- = (z - \Delta)/z_c$.
Once the bin width and the correlation length are set,
the correlation matrix will only depend on z, the distance
between the bins of interest.
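
As a check on Eq. (8), the closed form can be compared with a brute-force evaluation of the double integral in Eq. (5). The sketch below is illustrative only; the function names, the midpoint-rule integrator and the sample values of the bin width and correlation length are choices made for this example.

```python
import math

def g(x):
    """Second antiderivative of 1/(1+x^2), built from Eqs. (6) and (7)."""
    return x * math.atan(x) - 0.5 * math.log1p(x * x)

def bin_covariance(z, delta, z_c, xi0=1.0):
    """Closed form of Eq. (8) for two bins of width delta separated by z."""
    x = z / z_c
    x_plus = (z + delta) / z_c
    x_minus = (z - delta) / z_c
    return xi0 * (z_c / delta) ** 2 * (g(x_plus) + g(x_minus) - 2.0 * g(x))

def bin_covariance_numeric(z, delta, z_c, xi0=1.0, n=200):
    """Midpoint-rule estimate of the double integral in Eq. (5)."""
    h = delta / n
    total = 0.0
    for a in range(n):
        u = (a + 0.5) * h              # position inside the first bin
        for b in range(n):
            v = z + (b + 0.5) * h      # position inside the second bin
            total += xi0 / (1.0 + ((u - v) / z_c) ** 2)
    return total * h * h / delta ** 2

delta, z_c = 0.075, 0.1                # 40 bins out to z = 3, and z_c = 0.1
checks = [(z, bin_covariance(z, delta, z_c), bin_covariance_numeric(z, delta, z_c))
          for z in (0.0, 0.075, 0.3)]
```

The bracket in Eq. (8) is simply g(x_+) + g(x_-) - 2g(x), which is why the same helper function serves all three terms.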

Note that the variance of the mean equation of state
over all the bins follows directly from this by taking $\Delta$
to be the entire redshift interval and z = 0. As long as
$z_c/z_{\max} \ll 1$, one can show that

$$\sigma_m^2 = \int_0^{z_{\max}} \frac{dz}{z_{\max}} \int_0^{z_{\max}} \frac{dz'}{z_{\max}}\, \xi_w(z - z') \simeq \frac{\pi \xi_w(0) z_c}{z_{\max}}. \qquad (9)$$

This is essentially the variance at any given point, $\xi_w(0)$,
divided by the number of effective degrees of freedom,
$N_{\rm eff} = z_{\max}/z_c$. While this expression is appropriate in
the $z_c/z_{\max} \ll 1$ limit, in practice we use the full expression,
since the corrections can be significant for larger
correlation lengths ($z_c > 0.4$).
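
The size of these corrections can be gauged with a short calculation. The sketch below assumes the Lorentzian correlation function introduced above and compares the limiting form of Eq. (9) with the full expression obtained from Eq. (8) by setting the bin width to the whole interval and the separation to zero; the function names are hypothetical.

```python
import math

def mean_variance_full(z_max, z_c, xi0=1.0):
    """Variance of w averaged over [0, z_max]: Eq. (8) with Delta = z_max, z = 0."""
    x = z_max / z_c
    return xi0 * (2.0 * x * math.atan(x) - math.log1p(x * x)) / x ** 2

def mean_variance_limit(z_max, z_c, xi0=1.0):
    """Leading term for z_c << z_max, as in Eq. (9)."""
    return math.pi * xi0 * z_c / z_max

z_max = 3.0
ratios = {}
for z_c in (0.01, 0.1, 0.4):
    ratios[z_c] = mean_variance_limit(z_max, z_c) / mean_variance_full(z_max, z_c)
    print(f"z_c = {z_c}: limit / full = {ratios[z_c]:.3f}")
```

Under these assumptions the limiting form overestimates the variance of the mean, by about a percent for very short correlation lengths and by tens of percent at the upper end of the range considered here.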

In practice, our inputs to the prior were the error in
the mean, $\sigma_m$, and the correlation distance, $z_c$. We chose
the error in the mean to be of order $\sigma_m \sim 0.2 - 0.5$, which
seemed representative of our present observational
uncertainty in the mean. We tried correlation lengths in the
range $0.1 < z_c < 0.4$, where the upper limit was beginning
to be strong enough to impact the observed modes
for the SN. The SN results were the first affected because 
they are the only probe we considered which constrained 
(however weakly) the shorter wavelength DE modes. 

The resulting prior takes the form,

$$\mathcal{P}_{\rm prior} \propto \exp\left[ -\frac{1}{2} (\mathbf{w}^{\rm true} - \mathbf{w}^{\rm fid})^T C^{-1} (\mathbf{w}^{\rm true} - \mathbf{w}^{\rm fid}) \right] \qquad (10)$$

where $C_{ij} = \langle \delta w_i \delta w_j \rangle$ is the correlation function
$\xi_w(|z - z'|)$ integrated over the bins (Eq. (8)). This prior
naturally constrains the high frequency modes without
over-constraining the lower frequency modes that are typically
probed by experiments.

As an aside, note that assuming no bin correlations,
i.e., a purely diagonal matrix for the prior, is equivalent to
using a delta function for the correlation prior, e.g.
$\xi(z) = \xi_0 \delta(z)$. (Any finite correlation distance will automatically
generate some off-diagonal correlation.) In such a case,
one finds $\langle \delta w_i \delta w_j \rangle = \xi_0 \delta_{ij}/\Delta$ and the mean variance
is $\sigma_m^2 = \xi_0/z_{\max}$. Thus, assuming a fixed total range,
the bin variance should grow with the number of bins,
$\langle \delta w^2 \rangle = \sigma_m^2 N_{\rm bins}$, to keep the mean variance unchanged.

III. COMPARING DARK ENERGY PROBES 
USING PCA WITH THE SMOOTHNESS PRIOR 

A. Figures of merit 

In order to compare the information content of var- 
ious probes, one needs to decide on a so-called Figure 
of Merit (FOM): a scalar quantity that one should op- 
timise. The most often used scalar quantity relates to 
the determinant of the Fisher or curvature matrix, which 
is effectively a measure of the volume of the parameter 
space. For example, this (or its square root) is used for 
the DETF figure of merit. There are good reasons for try- 
ing to minimize the volume of phase space, particularly in 
the context of comparing the Bayesian evidences of different
models. In these calculations, an Occam's factor
plays a role, which is basically the ratio of the total possible
volume of parameter space to the volume allowed
by the data. Thus minimizing this volume factor would 
allow us to rule out models with higher significance. 

While the parameter space volume is a very natural 
measure, it is not the only possible measure. Another 
possibility is to focus on the trace of the inverse Fisher 
matrix, which corresponds to minimizing the sum of the 
variances, also known as the mean squared error (MSE). 
This is dominated by the least constrained modes, so 
minimizing it will tend to spread the information out 
among many different eigenvectors. This is useful for a 
number of reasons. One is to minimize the errors in re- 
constructing the true behavior of the parameter from 
observations, which is directly quantified by the MSE. 
More modes also provide the opportunity for checks of 
the consistency of the data. If the number of well con- 
strained modes is less than the number of parameters 
in our model, there will always exist degeneracies in the 
model; this makes it impossible to check the consistency 
of data within that model. 

For example, in the one-parameter $\Lambda$CDM model, a
single dark energy observation is sufficient to constrain
the DE density and we could use the CMB acoustic scale
to constrain it. If another observation, such as a single
well-measured SN eigenmode, is available, we can check
that it gives a consistent answer for that one parameter.
But for a two parameter theory (e.g. constant w or $\Lambda$ with
curvature) these data must be combined to find a unique
model, and no longer can be used as a consistency check.
And for a three parameter theory (like linearly evolving
w(z)), more SN eigenmodes or other types of data must
be found to find a unique set of model parameters. Thus,
it pays to constrain more modes than your theory has
parameters.

It is straightforward to use the MSE as a FOM to eval- 
uate the value added by a given experiment. Ideally, we 
want to minimize the difference between the true and
estimated w(z),

$${\rm MSE} = \sum_i (w_i^{\rm est} - w_i^{\rm true})^2 = \sum_i (\alpha_i^{\rm est} - \alpha_i^{\rm true})^2, \qquad (11)$$

the latter following from the orthonormality of the eigenvector
basis. In the absence of any priors, this is expected
to be

$$\langle {\rm MSE} \rangle = \sum_i \sigma^2(\alpha_i) = {\rm Tr}\, F^{-1}. \qquad (12)$$

This will be dominated by the poorest determined modes,
which is why having some kind of prior is essential. Taking
into account priors, F is replaced in this expression
by $C^{-1} + F$. The MSE arising from the prior alone is
${\rm MSE} = {\rm Tr}\, C = N \xi_w(0)$ where $\xi_w(0)$ is the variance in a
single bin; this is independent of the shape of the correlation
function. Adding more experimental data reduces
the MSE as the well determined modes contribute less.
The amount by which the MSE is reduced can be seen 
as a measure of how informative the experiment is. 
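
To make the role of the prior concrete, the following toy calculation evaluates the trace of the inverse of $C^{-1} + F$ for a diagonal prior and a hypothetical Fisher matrix that constrains only three modes. The eigenvalues, the random orthonormal basis and the variable names are assumptions for illustration; only the 40 bins and the prior error in the mean of 0.3 are taken from the text.

```python
import numpy as np

n_bins = 40
sigma_m = 0.3
prior_var = n_bins * sigma_m ** 2        # per-bin variance of the diagonal prior
C_inv = np.eye(n_bins) / prior_var

# Hypothetical Fisher matrix: three well measured orthonormal modes, the rest
# completely unconstrained (zero eigenvalue), so F alone is not invertible.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(n_bins, n_bins)))
lam = np.zeros(n_bins)
lam[:3] = [1e4, 1e3, 1e2]
F = Q @ np.diag(lam) @ Q.T

mse_prior = np.trace(np.linalg.inv(C_inv))        # prior alone: Tr C
mse_post = np.trace(np.linalg.inv(C_inv + F))     # prior plus data
new_modes = (mse_prior - mse_post) / prior_var    # reduction in units of one prior mode
```

In this example the trace of the inverse of F alone would be undefined, while the prior keeps the MSE finite and the reduction in MSE, measured in units of the prior variance of a single mode, comes out very close to the three modes that were put in.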



For the MSE criterion, it will generally be more effective
to maximize the number of modes which are reasonably
well determined ($\sigma(\alpha_i) < \sigma_p$) than to have a smaller
number of better determined eigenmodes. To reduce the
MSE, it is most effective to focus on the modes which
have larger error bars. Much different conclusions can
result from using a different FOM [21]. For example, if we
used the volume of the parameter error ellipses, rather 
than the MSE, as the figure of merit, a factor of two re- 
duction in any error bar would lead to the same reduction 
in the volume, regardless of how well determined the er- 
ror bar was initially. If the parameter was already tightly 
constrained, the volume could be reduced substantially 
while the MSE would be largely unchanged. 
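
A two-parameter toy case shows the difference between the two figures of merit. The error bars below are invented for illustration, and the "volume" is taken to be the product of the one-sigma errors of uncorrelated parameters, which is an assumption of this sketch.

```python
import math

def mse(sigmas):
    """Sum of the variances of uncorrelated parameters."""
    return sum(s * s for s in sigmas)

def volume(sigmas):
    """Product of the error bars, proportional to the error-ellipsoid volume."""
    return math.prod(sigmas)

base = [0.01, 1.0]            # one tightly and one loosely constrained mode
tight_halved = [0.005, 1.0]   # halve the already small error bar
loose_halved = [0.01, 0.5]    # halve the large error bar instead

vol_gain_tight = volume(tight_halved) / volume(base)
vol_gain_loose = volume(loose_halved) / volume(base)
mse_gain_tight = mse(base) - mse(tight_halved)
mse_gain_loose = mse(base) - mse(loose_halved)
```

Both improvements halve the volume, but only the second makes a noticeable difference to the MSE, which is the behaviour described above.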



B. Application to future data 

For our forecasts, we assume the following probes:
LSST for WL and GC, Planck for CMB, and a Joint
Dark Energy Mission (JDEM) for SN. For the galaxy
distribution we assume the most optimistic LSST-like
survey with several billion galaxies distributed out to
z = 3. We then divide the total galaxies into ten and
six photometric bins for the calculation of GC and WL
respectively. The survey parameters were adopted from
the recent review of the LSST collaboration [22]. Namely,
we use $f_{\rm sky} = 0.5$, $N_G = 50$ gal/arcmin$^2$ for both
WL and counts; the shear uncertainty is assumed to be
$\gamma_{\rm rms} = 0.18 + 0.042z$, and the photometric redshift
uncertainty is given by $\sigma(z) = 0.03(1 + z)$. We only use
the information from scales that are safely in the linear
regime (corresponding to $k < 0.1\,h$/Mpc). For CMB we
include the Planck temperature and polarization spectra
and their cross-correlation; for the SN we assume the
detection of 2000 SN distributed out to a redshift of 1.7.
Details of the assumptions for the experiments and the
calculation of the Fisher matrices can be found in [23]
and in [24].

The eigenvectors and eigenvalues clearly depend on 
how we treat the other cosmological parameters. For the 
CMB constraints, we use the CMB data alone, assum- 
ing a flat universe, and marginalize over other cosmolog- 
ical parameters including the dark matter density, baryon 
density, spectral index, Hubble parameter, and optical 
depth. For the SN, we marginalize over the dark matter 
density and the value of the intrinsic SN magnitude M, 
but since the dark matter density is likely to be well de- 
termined, we use the CMB Fisher matrix marginalized 
over dark energy parameters as the prior. Formally, one 
should add instead the full CMB and SN Fisher matri- 
ces to find out the joint dark energy eigenmodes; how- 
ever, by first marginalizing over the CMB dark energy 
parameters, one obtains a clearer picture of what the SN 
are actually measuring, as otherwise the first CMB mode 
would contaminate the SN information. We do similarly 
for the GC and WL auto-correlations, but for GC, we 
marginalize also over a different possible bias in each of
the ten photometric redshift bins. (The assumption of
independent biases is conservative and significantly re- 
duces the effectiveness of the GC spectra to tell us about 
dark energy.) Finally, for the cross-correlations, we use 
the CMB, WL and the GC data to give priors for the 
bias and other cosmological parameters. 

Fig. 1 shows the spectra of eigenvalues for the various
data sets, their cross correlations and the combined data.
The shaded region provides a rough threshold based on a
diagonal prior with $\sigma_m > 0.3$, and any eigenmodes above
this are taken to be informative. Many of the experiments
provide multiple independent modes, with upwards of ten
informative modes for the combined data set. For a correlated
prior, the threshold to be informative depends on
the frequency of the mode and this can mean that fewer
informative modes are found.

FIG. 1: The eigenvalues ($\sigma^{-2}(\alpha_i)$) for the raw Fisher matrices
(no priors assumed) for different experiments. The grey
shaded region shows the diagonal prior of $\sigma_m > 0.3$.

The best determined eigenvectors are shown in Fig. 2 
for the various types of measurements we consider. We 
plot only those with eigenvalues above a given thresh- 
old. The predictions vary significantly depending on the 
experiment. The CMB gives one well determined mode, 
which corresponds to the angular distance to the last 
scattering surface [25]. The SN and the GC correlations 
give a larger number of well determined modes, which 
is a benefit of the range in redshifts of the data. The 
cross-correlation only gives a single marginally deter- 
mined mode, but it probes to higher redshifts than many 
of the others. Finally, we show the well determined eigen- 
vectors for the total Fisher matrix. When combined, the 
experiments can probe higher frequency modes than they 
can probe independently. This is because the combina- 
tion is sensitive to the differences in the individual eigen- 
vectors. As might be expected from our choice of fiducial 
model ($w_0 = -1$), the experiments are most sensitive to
lower redshifts. 



FIG. 2: The best determined eigenvectors for different kinds 
of experiments. No priors have been assumed and the am- 
plitudes are normalized to unity. Only the first five modes 
are plotted if more than five modes are well-constrained. The 
modes are shown, in the order from better constrained to 
worse, as black solid, red dashed, blue dash-dot, green dash- 
dot-dot and magenta dot curves. With additional modes, we 
can begin to probe higher frequency changes in w(z); however, 
all the data is primarily constraining only the low redshift be- 
haviour (z < 1). 



How much the experiments improve the MSE depends
on the choice of correlation function for the prior. In
Table I, we vary $\xi_w(0)$ and $z_c$ while holding the prior
constraint on the mean fixed, $\sigma_m = 0.3$. If we choose small $z_c$,
all the bins are effectively uncorrelated and the variance
in each bin is large. In this case, the MSE is large and
all of the eigenmodes are constrained by the prior at the
same level. As we let $z_c$ get larger, the variance in each
bin shrinks, as does the MSE. Modes with wavelengths
smaller than $z_c$ become tightly constrained by the prior.
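
The prior-only MSE for a given correlation length can be estimated directly from the expressions of Section II. The sketch below assumes the Lorentzian correlation function, 40 uniform bins out to z = 3, and a normalization fixed by requiring the full (not limiting) variance of the mean to equal the square of 0.3; the function names are hypothetical. Under these assumptions it appears to reproduce the "no data" row of Table I.

```python
import math

def g(x):
    """Second antiderivative of 1/(1+x^2)."""
    return x * math.atan(x) - 0.5 * math.log1p(x * x)

def averaged_variance(width, z_c, xi0=1.0):
    """Variance of w averaged over an interval of given width (Eq. (8), z = 0)."""
    x = width / z_c
    return xi0 * 2.0 * g(x) / x ** 2

def prior_mse(z_c, n_bins=40, z_max=3.0, sigma_m=0.3):
    # Fix xi_w(0) so that the variance of the mean over [0, z_max] is sigma_m^2.
    xi0 = sigma_m ** 2 / averaged_variance(z_max, z_c)
    # Tr C = number of bins times the variance of a single bin.
    return n_bins * averaged_variance(z_max / n_bins, z_c, xi0)

mse_diag = 40 * (40 * 0.3 ** 2)   # diagonal prior: N_bins times the bin variance
mse_01 = prior_mse(0.1)
mse_04 = prior_mse(0.4)
print(f"diagonal: {mse_diag:.1f}, z_c = 0.1: {mse_01:.1f}, z_c = 0.4: {mse_04:.1f}")
```

The single-bin variance used here is the bin-averaged value, which is slightly smaller than the point variance when the bin width is not negligible compared with the correlation length.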

For the case of a diagonal prior, these numbers are
easy to interpret in terms of the number of newly constrained
modes. For the prior alone, the MSE is given
by $N_{\rm bins} \times \langle \delta w^2 \rangle = N_{\rm bins}^2 \sigma_m^2 = 144$. Adding the forecast
data, some of the modes will have significantly reduced
error bars compared to the prior alone. As the prior is diagonal,
the new eigenmodes were also eigenmodes of the
prior with the same variance, given by $N_{\rm bins} \sigma_m^2 = 3.6$.
The reduction of the MSE thus roughly tells us how
many new modes are more constrained compared to the
prior. The SN data, for example, reduce the MSE from
144 to 125, which means there is new information on
$(144 - 125)/3.6 \simeq 5$ modes, beyond what was assumed
in the prior. This can be seen in the spectrum of eigenmodes,
shown in Fig. 1, counting the number of modes
with amplitudes above the prior threshold shown by the
horizontal line.

              diagonal   [modes]    z_c = 0.1   z_c = 0.4
   no data     144.0                  35.0        11.5
   SN          124.9     [5.3]        27.4         8.2
   CMB         138.4     [1.6]        31.0         8.6
   GC          126.1     [5.0]        26.3         6.7
   WL          128.5     [4.3]        24.7         5.6
   GCxCMB      124.9     [5.3]        24.7         7.2
   WLxCMB      136.1     [2.2]        29.9         8.9
   WLxGC       117.3     [7.4]        21.3         5.1
   Total       102.6     [11.5]       16.7         3.0

TABLE I: The mean squared error (which is related to the
number of well constrained modes) for various priors and
experiments. The priors are normalized so that $\sigma_m = 0.3$ and
the 'diagonal' prior means no correlations exist between bins.
For this diagonal prior, we give in brackets the inferred number
of modes meaningfully constrained by the observations.
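
The bracketed mode counts follow from the diagonal-prior column by simple arithmetic, as this short sketch illustrates. The MSE values are copied from Table I; the dictionary and variable names are only for this example.

```python
n_bins = 40
sigma_m = 0.3
per_mode = n_bins * sigma_m ** 2      # prior variance of each mode: 3.6
prior_mse = n_bins * per_mode         # prior-only MSE: 144

# Diagonal-prior MSE values from Table I.
mse_with_data = {
    "SN": 124.9, "CMB": 138.4, "GC": 126.1, "WL": 128.5,
    "GCxCMB": 124.9, "WLxCMB": 136.1, "WLxGC": 117.3, "Total": 102.6,
}

# Number of newly constrained modes implied by the reduction in the MSE.
new_modes = {name: (prior_mse - value) / per_mode
             for name, value in mse_with_data.items()}

for name, n in new_modes.items():
    print(f"{name:8s} {n:5.1f}")
```

The count is an estimate rather than an integer because partially constrained modes contribute a fraction of a prior variance each.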



For a correlated prior, the interpretation of the MSE
is less straightforward for two reasons. First, the prior
eigenvalues are no longer all the same; as some start out
higher, improving information on these modes has less of
an impact on the total MSE than in the diagonal case.
More fundamentally, in this case the eigenvectors change
when the forecast data are included, so it is no longer possible
to have a one-to-one correspondence of modes with
and without the forecast data. We can, however, put a
lower bound on the number of new modes by dividing
by the largest prior variance; this will generally be for
the homogeneous mode, which as above is set by the constraint
on the variance of the mean ($N_{\rm bins} \sigma_m^2$). As can be
inferred from the table, the number of new modes estimated
in this way is significantly reduced, especially as
the prior correlation length is increased.

The relative value of different experiments depends
somewhat on what was assumed to be known already.
For small $z_c$, the SN experiments are among the most
informative of the single data sets we consider. However, as
$z_c$ is increased, the higher frequency modes probed by the
SN are already strongly constrained by the prior, reducing
the impact of the SN data; other observations, such
as weak lensing, measure lower frequency modes, and are
not as affected by the change in the prior.



IV. ADDITIONAL APPLICATIONS OF THE 
PCA METHOD 

A. Reconstructing w(z) 

Given the measurements, one would like to reconstruct
a best estimate of the equation of state history w(z). In
our Bayesian approach, this simply means finding the
model parameters which maximize the posterior probability
distribution, which is the prior distribution times
the likelihood expected from the observational Fisher matrix.
Some of the eigenmodes will be determined by the
data, and some will be determined by the prior. In particular,
if the Fisher matrix error for an eigenmode, $\sigma(\alpha_i)$,
is much smaller than that from the prior, the parameter
will be well determined and independent of the prior
assumptions. Modes where $\sigma(\alpha_i)$ from the observational
Fisher matrix is much larger than the prior error will
be poorly determined and the amplitudes will revert to
their value at the peak of the prior, which is the fiducial
model.
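
For a single eigenmode this behaviour reduces to the familiar inverse-variance combination of two Gaussians. The sketch below assumes a prior that is diagonal in the eigenmode basis and centred on the fiducial model (amplitude zero), which holds for the uncorrelated prior but only approximately for a correlated one; the function name and the numerical values are illustrative.

```python
import math

def posterior_amplitude(alpha_data, sigma_data, sigma_prior):
    """Posterior mean and error of one mode amplitude for Gaussian data and prior.

    The prior is centred on the fiducial model, i.e. on amplitude zero.
    """
    weight_data = 1.0 / sigma_data ** 2
    weight_prior = 1.0 / sigma_prior ** 2
    mean = weight_data * alpha_data / (weight_data + weight_prior)
    sigma = 1.0 / math.sqrt(weight_data + weight_prior)
    return mean, sigma

sigma_prior = math.sqrt(3.6)   # diagonal prior of Table I: variance 3.6 per mode

# A well measured mode keeps its measured amplitude and error bar ...
well_mean, well_sigma = posterior_amplitude(0.5, 0.01, sigma_prior)
# ... while a poorly measured one reverts to the fiducial model and prior error.
poor_mean, poor_sigma = posterior_amplitude(0.5, 100.0, sigma_prior)
```

The poorly measured mode ends up with the prior error bar rather than a vanishing one, which is the sense in which the reconstruction reflects the actual degree of ignorance.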

Earlier discussions of PCA took a more frequentist
approach to reconstructing w(z) [10], where the function 
was reconstructed by using a subset of the PCA modes. 
Deciding how many modes are kept requires minimizing 
a 'risk function' which is effectively the mean squared 
error, separated into a variance and a bias contribution. 
The size of the bias depends on how much the true un- 
derlying model differs from the fiducial model which is 
assumed; thus the number of modes one keeps depends 
on what you think the underlying model is, which is obvi- 
ously unknown. Thus, there is some ambiguity in the pre- 
scription; it would however make sense to base the bias 
not on a single model, but on the ensemble of potential 
models. This is precisely what we attempt to quantify 
with our choice of Bayesian prior. 

Not too surprisingly, the Bayesian and frequentist ap- 
proaches should yield similar reconstructions. The modes 
excluded from the frequentist estimator are precisely 
those modes where the Bayesian prior overwhelms the 
information from the observations. In regions where the 
data are good, the true model will be reconstructed 
well; in regions where the data are poor, the reconstruc- 
tion reverts to the fiducial model assumed on theoretical 
grounds. 

The main difference is that the frequentist estimator 
explicitly drops the more poorly determined modes from 
the beginning, which means no matter how large they are 
measured to be, they will not affect the reconstruction. 
Since these modes are dropped, the reconstructed error 
bars for modes where there is no information actually 
appear smaller, which is clearly incorrect. In the Bayesian 
case, however, the errors instead revert to the theoretical 
uncertainty, which is more representative of our degree 
of ignorance. 

The predicted MSE gives a good idea of the expected 
errors in the reconstruction, assuming the true model is 
typical of those allowed by the prior. In that sense, the 
MSE is a useful figure of merit. From the MSE values 
in Table I, one can see that the reconstruction will be 
very poor in the case of the diagonal prior we have as- 
sumed. This is because the prior knowledge we assumed 
was weak, and even after the addition of data, much
uncertainty remains (i.e. the MSE is still large). The correlated
(non-diagonal) prior gives a much smaller MSE, and so
should give a much better reconstruction assuming the
true model is typical given the prior (e.g. does not have
high frequency oscillations). This seems to be a significant
advantage of the correlated prior.



B. PCA for data compression 

A key advantage to the PCA approach is that it re- 
tains information about all the modes which the obser- 
vations are able to constrain, and thus can be used as 
a compressed form of data which can be applied to any 
theoretical parameterization that can be translated into 
our binning choice. Rather than separately calculating 
the constraints for each parameterization from the orig- 
inal data, this can be done directly from the principal 
component representation, and this will apply both to 
forecasts as well as to the measured constraints. 

We can easily emulate any other w(z) parameterization 
to find the predicted parameter error bars without regen- 
erating the Fisher matrices from scratch. All we need is 
to expand the derivative with respect to the given param- 
eter in terms of our eigenmode basis. That is, for each of 
the new parameters $p_a$, we find the coefficients $\alpha_i^a$ such that

$$\frac{\partial w(z)}{\partial p_a} = \sum_i \alpha_i^a \, e_i(z) \qquad (13)$$



We can then find the Fisher matrix in the new basis by 
simple rotation and holding fixed the remaining dark en- 
ergy parameters 



$$F_{ab} = \sum_{ij} \alpha_i^a F_{ij} \alpha_j^b = \sum_i \alpha_i^a \alpha_i^b \lambda_i \qquad (14)$$



Obviously it helps if the fiducial model is the same for the
various parameterizations, and constant w is a reasonable
choice. We have checked that this prescription works to
within a few percent for the constant $w_0$ parameterization
and for the parameterization linear in the scale
factor, and we expect the same for any function which can
be well approximated by our binning. The eigenbasis
is also a useful basis for searching for the peak of the
likelihood surface, since the various amplitudes become
decorrelated.
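
As an illustration of Eqs. (13) and (14), here is a minimal Python sketch. The Fisher matrix is a random toy stand-in rather than a forecast, the helper `project_fisher` is a hypothetical name, and the linear parameterization is assumed to take the form w(a) = w_0 + w_a (1 - a), evaluated at assumed bin centres `a`.

```python
import numpy as np

def project_fisher(fisher_bins, dw_dp):
    """Rotate a binned-w Fisher matrix into a new parameter basis via its eigenmodes.

    fisher_bins: (N, N) symmetric Fisher matrix for w in N bins.
    dw_dp:       (N, P) array; column a holds dw/dp_a evaluated in each bin.
    """
    lam, modes = np.linalg.eigh(fisher_bins)  # columns of `modes` are the e_i
    alpha = modes.T @ dw_dp                   # alpha[i, a] = e_i . dw/dp_a, Eq. (13)
    return alpha.T @ (lam[:, None] * alpha)   # sum_i alpha_i^a lam_i alpha_i^b, Eq. (14)

# Toy Fisher matrix built from random sensitivities (not a real experiment)
rng = np.random.default_rng(0)
n_bins = 20
a = np.linspace(0.1, 1.0, n_bins)             # assumed bin centres in scale factor
sens = rng.normal(size=(50, n_bins))
fisher_bins = sens.T @ sens

# Derivatives of w with respect to (w_0, w_a) for w(a) = w_0 + w_a (1 - a)
dw_dp = np.column_stack([np.ones(n_bins), 1.0 - a])

fisher_new = project_fisher(fisher_bins, dw_dp)
errors = np.sqrt(np.diag(np.linalg.inv(fisher_new)))
print("Fisher matrix for (w_0, w_a):")
print(fisher_new)
print("marginalized errors:", errors)
```

When all eigenmodes are kept, the result agrees with the direct projection `dw_dp.T @ fisher_bins @ dw_dp` to numerical precision. The errors on the new parameters then follow from the inverse of `fisher_new`.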



V. CONCLUSIONS 

The choice of prior is a critical one in the study of dy- 
namical dark energy; it tells us how informative a given 
experiment will be and must also play a role in answering 
the fundamental questions about whether dark energy is
dynamical or whether a modified gravity model should be
preferred. The principal component approach attempts 
to remove the reliance on particular simplified parameterizations,
such as a constant or linearly varying w(z),
and replace them with more generic assumptions about 
the behavior of dark energy. Here we have focused on a 
phenomenological approach to the prior, but in principle
one wants to base this on theoretical models. For exam- 
ple, if one had a measure on the possible quintessence 
potentials, one could translate this into a measure for 
w(z).

Once a prior is agreed upon, the principal components and
MSE provide a basis for planning a coherent strategy
to study dark energy. The principal components demonstrate
each experiment's individual sensitivity and its potential
for adding information orthogonal to other experiments.
We can then use this information to investigate
how effective different experiments or combinations of ex- 
periments will be in reducing MSE from its value based 
on the present data. The MSE figure of merit naturally 
focusses on those degrees of freedom we know the least 
about, resulting in more constrained modes which can 
provide consistency checks for a theoretical model. 

Finally, the principal components represent an effec- 
tively lossless means of compressing the observed data, 
which can then be used to constrain any theoretically 
motivated dark energy history without repeating the ob- 
servational analysis. This is particularly relevant when 
experiments are sensitive to modes which are orthogonal 
to the simplest dark energy parameterizations; in such
cases, evaluating only the naive dark energy parameters 
can greatly undervalue what is actually learned in an
experiment.

Note: This is an extended version of a paper which
first appeared in 2005; here we have expanded the discus- 
sion to clarify the key aspects of the paper, especially the 
correlation function prior and using the MSE to quantify 
the new information provided by experiments. A number 
of papers addressing the principal component approach 
to dark energy have appeared recently [26] and raised the 
profile of this approach; in this context, we felt it worth 
expanding and clarifying our original discussion of these